-
Notifications
You must be signed in to change notification settings - Fork 33
feat: support zero-point decompression for asymmetric quantization (packed) #463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Fix decompress_weight method in PackedQuantizationCompressor to support unpacking zero-points - Add comprehensive tests for zero-point packing/unpacking with GROUP and CHANNEL strategies - Add end-to-end integration tests for asymmetric quantization workflow - Ensure packed tensors are contiguous for safetensors compatibility Resolves issue referenced in vllm-project/llm-compressor#1704
…move manual creation
…v similarity; cleanup temp usage
2cf7124
to
126fc89
Compare
I've addressed all your concerns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good!
Some questions about the e2e test but otherwise, looks great!
} | ||
|
||
state_dict = model.state_dict() | ||
compressed_state_dict = compressor.compress( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be simpler to use compress_model / decompress_model, both of which will work on the model in memory
compress / decompress are primarily meant for disk
if hasattr(module, param_name): | ||
getattr(module, param_name).data = value.clone() | ||
else: | ||
module.register_parameter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we need to separately register parameters here?
if hasattr(module, param_name): | ||
getattr(module, param_name).data = value.clone() | ||
else: | ||
module.register_parameter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good pending @dsikka 's comments, thanks for updating!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending @dsikka 's comments! Thank you for this contribution
Follow-up to PR #459 (which was closed due to an accidental history rewrite). The branch has been restored to the pre-rewrite tip (2cf7124).
This PR:
Refs vllm-project/llm-compressor#1704.